A regression tree approach using mathematical programming
نویسندگان
چکیده
Regression analysis is a machine learning approach that aims to accurately predict the value of continuous output variables from certain independent input variables, via automatic estimation of their latent relationship from data. Tree-based regression models are popular in literature due to their flexibility to model higher order non-linearity and great interpretability. Conventionally, regression tree models are trained in a two-stage procedure, i.e. recursive binary partitioning is employed to produce a tree structure, followed by a pruning process of removing insignificant leaves, with the possibility of assigning multivariate functions to terminal leaves to improve generalisation. This work introduces a novel methodology of node partitioning which, in a single optimisation model, simultaneously performs the two tasks of identifying the break-point of a binary split and assignment of multivariate functions to either leaf, thus leading to an efficient regression tree model. Using six real world benchmark problems, we demonstrate that the proposed method consistently outperforms a number of state-of-the-art regression tree models and methods based on other techniques, with an average improvement of 7–60% on the mean absolute errors (MAE) of the predictions. © 2017 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license. ( http://creativecommons.org/licenses/by/4.0/ )
منابع مشابه
Shuffled Frog-Leaping Programming for Solving Regression Problems
There are various automatic programming models inspired by evolutionary computation techniques. Due to the importance of devising an automatic mechanism to explore the complicated search space of mathematical problems where numerical methods fails, evolutionary computations are widely studied and applied to solve real world problems. One of the famous algorithm in optimization problem is shuffl...
متن کاملAn Integrated DEA and Data Mining Approach for Performance Assessment
This paper presents a data envelopment analysis (DEA) model combined with Bootstrapping to assess performance of one of the Data mining Algorithms. We applied a two-step process for performance productivity analysis of insurance branches within a case study. First, using a DEA model, the study analyzes the productivity of eighteen decision-making units (DMUs). Using a Malmquist index, DEA deter...
متن کاملA Mixed Integer Programming Approach to Optimal Feeder Routing for Tree-Based Distribution System: A Case Study
A genetic algorithm is proposed to optimize a tree-structured power distribution network considering optimal cable sizing. For minimizing the total cost of the network, a mixed-integer programming model is presented determining the optimal sizes of cables with minimized location-allocation cost. For designing the distribution lines in a power network, the primary factors must be considered as m...
متن کاملMathematical solution of multilevel fractional programming problem with fuzzy goal programming approach
In this paper, we show a procedure for solving multilevel fractional programming problems in a large hierarchical decentralized organization using fuzzy goal programming approach. In the proposed method, the tolerance membership functions for the fuzzily described numerator and denominator part of the objective functions of all levels as well as the control vectors of the higher level decision ...
متن کاملOPTIMIZATION OF TREE-STRUCTURED GAS DISTRIBUTION NETWORK USING ANT COLONY OPTIMIZATION: A CASE STUDY
An Ant Colony Optimization (ACO) algorithm is proposed for optimal tree-structured natural gas distribution network. Design of pipelines, facilities, and equipment systems are necessary tasks to configure an optimal natural gas network. A mixed integer programming model is formulated to minimize the total cost in the network. The aim is to optimize pipe diameter sizes so that the location-alloc...
متن کاملTowards Supply Chain Planning Integration: Uncertainty Analysis Using Fuzzy Mathematical Programming Approach in a Plastic Forming Company
Affected by globalization and increased complexity, supply chain managers have learned about the importance of Sales and Operations Planning (S&OP). However, in large scale supply chains, S&OP has received little attention, by both academics and practitioners. The purpose of this manuscript is to investigate the advantages of S&OP process using a mathematical modeling approach in a large scale ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Expert Syst. Appl.
دوره 78 شماره
صفحات -
تاریخ انتشار 2017